Hybrid word-subword decoding for spoken term detection
Authors
Abstract
This paper deals with a hybrid word-subword recognition system for spoken term detection. The decoding is driven by a hybrid recognition network, and the decoder directly produces hybrid word-subword lattices. One phone model and two multigram models were tested to represent subword units. The systems were evaluated in terms of spoken term detection accuracy and index size. We concluded that the best subword model for hybrid word-subword recognition is the multigram model trained on the word recognizer vocabulary. We achieved an improvement in word recognition accuracy, and in spoken term detection accuracy when in-vocabulary and out-of-vocabulary terms are searched separately. Spoken term detection accuracy with the full (in-vocabulary and out-of-vocabulary) term set was slightly worse, but the required index size was significantly reduced.
Similar resources
Hybrid word-subword spoken term detection
The thesis investigates keyword spotting and spoken term detection (STD), which are considered subsets of spoken document retrieval. It deals with two-phase approaches in which speech is first processed by a speech recognizer, and the search for queries is performed in the output of this recognizer. Standard large vocabulary continuous speech recognizer (LVCSR) with fixed vocabulary is not c...
Merging search spaces for subword spoken term detection
We describe how complementary search spaces, addressed by two different methods used in Spoken Term Detection (STD), can be merged for German subword STD. We propose fuzzy-search techniques on lattices to narrow the gap between subword and word retrieval. The first technique is based on edit distance, where no a priori knowledge about confusions is employed. Additionally, we propose a weighti...
A robust fusion method for multilingual spoken document retrieval systems employing tiered resources
In this study, we present two novel fusion approaches to merge subword and word based retrieval methods within a multilingual spoken document retrieval (SDR) system. Considering the fact that more than 6000 languages are spoken in the world today, resources (e.g., text and audio data, pronunciation lexicon) needed to develop Automatic Speech Recognition (ASR) systems for such a range of languag...
An approach for efficient open vocabulary spoken term detection
A hybrid two-pass approach for facilitating fast and efficient open vocabulary spoken term detection (STD) is presented in this paper. A large vocabulary continuous speech recognition (LVCSR) system is deployed for producing word lattices from audio recordings. An index construction technique is used for facilitating very fast search of lattices for finding occurrences of both in vocabulary (IV...
Comparing decoding strategies for subword-based keyword spotting in low-resourced languages
For languages with limited training resources, out-of-vocabulary (OOV) words are a significant problem, both for transcription and keyword spotting. This paper investigates the use of subword lexical units for keyword spotting. Three strategies for using the subword units are explored: 1) converting word-based lattices to subword lattices after decoding, 2) performing a separate decoding for ea...